Latent Document Re-Ranking

Authors

  • Dong Zhou
  • Vincent P. Wade
Abstract

Re-ranking initial retrieval results by exploring the intrinsic structure of documents is widely researched in information retrieval (IR) and has attracted considerable attention. However, one drawback is that existing algorithms treat queries and documents separately. Furthermore, most approaches are built predominantly upon graph-based methods, which may ignore some hidden information in the retrieval set. This paper proposes a novel document re-ranking method based on Latent Dirichlet Allocation (LDA) which exploits the implicit structure of the documents with respect to the original queries. Rather than relying on graph-based techniques to identify the internal structure, the approach tries to find the latent structure of “topics” or “concepts” in the initial retrieval set. We then compute the distance between queries and initial retrieval results based on the latent semantic information deduced. Empirical results demonstrate that the method achieves significant improvements over various baseline systems.
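
The re-ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which distance measure or score combination the paper uses, so the Jensen-Shannon divergence, the linear interpolation with `weight`, and all function and variable names here are assumptions chosen for illustration. The topic distributions are hypothetical inputs, taken to have been inferred beforehand by an LDA model fitted on the initial retrieval set, and the initial scores are assumed to be normalised to [0, 1].

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rerank(query_topics, doc_topics, initial_scores, weight=0.5):
    """Return document ids sorted by a combined score (best first).

    Assumption: the combined score interpolates the initial retrieval
    score with a topic-space similarity, 1 - JS / ln 2, which lies in [0, 1].
    """
    combined = {}
    for doc_id, topics in doc_topics.items():
        similarity = 1.0 - js_divergence(query_topics, topics) / math.log(2)
        combined[doc_id] = (weight * initial_scores[doc_id]
                            + (1.0 - weight) * similarity)
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical topic distributions over three latent topics.
query_topics = [0.8, 0.1, 0.1]
doc_topics = {
    "d1": [0.1, 0.8, 0.1],  # ranked first initially, but off-topic
    "d2": [0.7, 0.2, 0.1],  # topically close to the query
    "d3": [0.3, 0.3, 0.4],
}
initial_scores = {"d1": 0.9, "d2": 0.6, "d3": 0.5}

print(rerank(query_topics, doc_topics, initial_scores))
```

With these made-up numbers, the document that is topically closest to the query (`d2`) moves ahead of the one with the highest initial score (`d1`). Setting `weight` to 1.0 reproduces the initial ranking, while 0.0 ranks by topic similarity alone.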

Similar resources

Re-ranking Method Based on Topic Word Pairs

Improving the ranking of relevant documents plays a key role in information retrieval. In this paper, a re-ranking approach based on topic word pairs is proposed to improve precision while preserving recall. Each topic word pair contains two correlated words, one of which is the original query word while the other comes from the documents. The selection is based on Probabilistic Latent ...

Dual-Space Re-ranking Model for Document Retrieval

The field of information retrieval still strives to develop models which allow semantic information to be integrated into the ranking process to improve performance in comparison to standard bag-of-words based models. A conceptual model has been adopted in general-purpose retrieval which can comprise a range of concepts, including linguistic terms, latent concepts and explicit knowledge concepts. O...

The Effectiveness of Results Re-Ranking and Query Expansion in Cross-language Information Retrieval

This paper presents the technical details and experimental results of the information retrieval system with which we participated in the NTCIR-8 ACLIA (Advanced Cross-language Information Access) IR4QA (Information Retrieval for Question Answering) task. Document corpora in Simplified Chinese (CS) and Traditional Chinese (CT), with topics in English, CS and CT, were used in our experiments. We com...

Language Modeling and Document Re-Ranking: Trinity Experiments at TEL@CLEF-2009

This paper presents a report on our participation in the CLEF-2009 monolingual and bilingual ad hoc TEL@CLEF tasks involving three different languages: English, French and German. Language modeling is adopted as the underlying information retrieval model. Because the data collection is extremely sparse, smoothing is particularly important when estimating a language model. The main purpose of the mo...

Smoothing Methods and Cross-Language Document Re-ranking

This paper presents a report on our participation in the CLEF 2009 monolingual and bilingual ad hoc TEL@CLEF task involving three different languages: English, French and German. Language modeling was adopted as the underlying information retrieval model. Because the data collection is extremely sparse, smoothing is particularly important when estimating a language model. The main purpose of the ...

Publication date: 2009